
    Three-dimensional shapelets and an automated classification scheme for dark matter haloes

    We extend the two-dimensional Cartesian shapelet formalism to d dimensions. Concentrating on the three-dimensional case, we derive shapelet-based equations for the mass, centroid, root-mean-square radius, and components of the quadrupole moment and moment of inertia tensors. Using cosmological N-body simulations as an application domain, we show that three-dimensional shapelets can be used to replicate the complex sub-structure of dark matter haloes and demonstrate the basis of an automated classification scheme for halo shapes. We investigate the shapelet decomposition process from an algorithmic viewpoint, and consider opportunities for accelerating the computation of shapelet-based representations using graphics processing units (GPUs).

    Comment: 19 pages, 11 figures, accepted for publication in MNRAS
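    The separable structure described above is easy to see in code. The following is a minimal sketch, not the authors' implementation: a dimensional 1-D Cartesian shapelet (a physicists' Hermite polynomial weighted by a Gaussian, with scale β), and a 3-D basis function built as a product of three 1-D ones. Function names and the default β are illustrative.

    ```python
    import math
    import numpy as np
    from numpy.polynomial.hermite import hermval

    def shapelet_1d(n, x, beta=1.0):
        """Dimensional 1-D Cartesian shapelet: H_n(x/beta) times a Gaussian,
        normalized so that the basis is orthonormal over the real line."""
        coeffs = np.zeros(n + 1)
        coeffs[n] = 1.0  # select H_n in the physicists' Hermite series
        norm = (beta * 2.0**n * math.factorial(n) * math.sqrt(math.pi)) ** -0.5
        return norm * hermval(x / beta, coeffs) * np.exp(-x**2 / (2.0 * beta**2))

    def shapelet_3d(n, X, Y, Z, beta=1.0):
        """3-D basis function as a separable product of 1-D shapelets,
        indexed by the triple n = (nx, ny, nz)."""
        nx, ny, nz = n
        return (shapelet_1d(nx, X, beta) *
                shapelet_1d(ny, Y, beta) *
                shapelet_1d(nz, Z, beta))
    ```

    Because the 3-D basis factorizes, a decomposition of a gridded halo density field reduces to nested 1-D inner products along each axis, which is also what makes the computation amenable to GPUs.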

    Accelerating incoherent dedispersion

    Incoherent dedispersion is a computationally intensive problem that appears frequently in pulsar and transient astronomy. For current and future transient pipelines, dedispersion can dominate the total execution time, meaning its computational speed acts as a constraint on the quality and quantity of science results. It is thus critical that the algorithm be able to take advantage of trends in commodity computing hardware. With this goal in mind, we present analysis of the 'direct', 'tree' and 'sub-band' dedispersion algorithms with respect to their potential for efficient execution on modern graphics processing units (GPUs). We find all three to be excellent candidates, and proceed to describe implementations in C for CUDA using insight gained from the analysis. Using recent CPU and GPU hardware, the transition to the GPU provides a speed-up of 9x for the direct algorithm when compared to an optimised quad-core CPU code. For realistic recent survey parameters, these speeds are high enough that further optimisation is unnecessary to achieve real-time processing. Where further speed-ups are desirable, we find that the tree and sub-band algorithms are able to provide 3-7x better performance at the cost of certain smearing, memory consumption and development time trade-offs. We finish with a discussion of the implications of these results for future transient surveys. Our GPU dedispersion code is publicly available as a C library at: http://dedisp.googlecode.com/

    Comment: 15 pages, 4 figures, 2 tables, accepted for publication in MNRAS
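    The 'direct' algorithm referred to above shifts each frequency channel by its cold-plasma dispersion delay and sums across channels. A minimal NumPy sketch follows (the production code is a C/CUDA library; names and the wrap-around shift via np.roll are illustrative simplifications):

    ```python
    import numpy as np

    KDM = 4.148808e3  # dispersion constant, MHz^2 pc^-1 cm^3 s

    def dedisperse(data, freqs_mhz, dm, dt):
        """Direct incoherent dedispersion of one filterbank block.
        data:      (nchan, nsamp) detected power
        freqs_mhz: channel centre frequencies (MHz)
        dm:        trial dispersion measure (pc cm^-3)
        dt:        sample interval (s)
        Returns the dedispersed time series of length nsamp."""
        f_ref = freqs_mhz.max()
        delays = KDM * dm * (freqs_mhz**-2 - f_ref**-2)  # seconds, >= 0
        shifts = np.round(delays / dt).astype(int)       # whole samples
        out = np.zeros(data.shape[1])
        for chan, s in enumerate(shifts):
            out += np.roll(data[chan], -s)               # undo the delay
        return out
    ```

    A search pipeline repeats this sum for many DM trials, which is why the per-trial cost dominates and why the regular, data-parallel access pattern maps well onto a GPU.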

    Analysing Astronomy Algorithms for GPUs and Beyond

    Astronomy depends on ever increasing computing power. Processor clock-rates have plateaued, and increased performance is now appearing in the form of additional processor cores on a single chip. This poses significant challenges to the astronomy software community. Graphics Processing Units (GPUs), now capable of general-purpose computation, exemplify both the difficult learning-curve and the significant speedups exhibited by massively-parallel hardware architectures. We present a generalised approach to tackling this paradigm shift, based on the analysis of algorithms. We describe a small collection of foundation algorithms relevant to astronomy and explain how they may be used to ease the transition to massively-parallel computing architectures. We demonstrate the effectiveness of our approach by applying it to four well-known astronomy problems: Högbom CLEAN, inverse ray-shooting for gravitational lensing, pulsar dedispersion and volume rendering. Algorithms with well-defined memory access patterns and high arithmetic intensity stand to receive the greatest performance boost from massively-parallel architectures, while those that involve a significant amount of decision-making may struggle to take advantage of the available processing power.

    Comment: 10 pages, 3 figures, accepted for publication in MNRAS
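    The role of arithmetic intensity in the conclusion above can be made quantitative with the standard roofline model (not specific to this paper): attainable throughput is capped by peak compute above the machine balance point and by memory bandwidth below it. The hardware numbers in the example are illustrative, not measurements from the paper.

    ```python
    def roofline(flops, bytes_moved, peak_gflops, bw_gbs):
        """Attainable GFLOP/s under the roofline model for a kernel that
        performs `flops` floating-point operations while moving
        `bytes_moved` bytes to/from memory."""
        intensity = flops / bytes_moved        # arithmetic intensity, FLOP/byte
        return min(peak_gflops, bw_gbs * intensity)
    ```

    For example, a streaming float64 reduction performs roughly 1 FLOP per 8 bytes read and is bandwidth-bound on any modern device, whereas an O(N^2) pairwise kernel such as inverse ray-shooting reuses each loaded value many times and can approach peak compute.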

    Stationarity, soft ergodicity, and entropy in relativistic systems

    Recent molecular dynamics simulations show that a dilute relativistic gas equilibrates to a Jüttner velocity distribution if ensemble velocities are measured simultaneously in the observer frame. The analysis of relativistic Brownian motion processes, on the other hand, implies that stationary one-particle distributions can differ depending on the underlying time-parameterizations. Using molecular dynamics simulations, we demonstrate how this relativistic phenomenon can be understood within a deterministic model system. We show that, depending on the time-parameterization, one can distinguish different types of soft ergodicity on the level of the one-particle distributions. Our analysis further reveals a close connection between time parameters and entropy in special relativity. A combination of different time-parameterizations can potentially be useful in simulations that combine molecular dynamics algorithms with randomized particle creation, annihilation, or decay processes.

    Comment: 4 pages
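    The two candidate stationary laws at issue can be written down directly. As a sketch (in units where m = c = 1, θ = kT/mc²; the function names are ours, and this is not the paper's simulation code): the standard 1-D Jüttner distribution, associated with observer-frame simultaneity, and the modified Jüttner distribution, which carries an extra 1/γ factor and arises under a proper-time parameterization.

    ```python
    import numpy as np

    def juttner_1d(p, theta):
        """Standard 1-D Juttner momentum distribution, observer-time
        parameterization; normalized numerically on the supplied grid."""
        f = np.exp(-np.sqrt(1.0 + p**2) / theta)
        return f / (np.sum(f) * (p[1] - p[0]))

    def modified_juttner_1d(p, theta):
        """Modified Juttner distribution with an extra 1/gamma weight,
        corresponding to a proper-time parameterization."""
        gamma = np.sqrt(1.0 + p**2)
        f = np.exp(-gamma / theta) / gamma
        return f / (np.sum(f) * (p[1] - p[0]))
    ```

    The 1/γ weight suppresses fast particles, so the two distributions are distinguishable in histograms of simulated ensembles, which is the level at which the different types of soft ergodicity appear.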

    The Radio Sky at Meter Wavelengths: m-Mode Analysis Imaging with the Owens Valley Long Wavelength Array

    A host of new low-frequency radio telescopes seek to measure the 21-cm transition of neutral hydrogen from the early universe. These telescopes have the potential to directly probe star and galaxy formation at redshifts 20 ≳ z ≳ 7, but are limited by the dynamic range they can achieve against foreground sources of low-frequency radio emission. Consequently, there is a growing demand for modern, high-fidelity maps of the sky at frequencies below 200 MHz for use in foreground modeling and removal. We describe a new widefield imaging technique for drift-scanning interferometers, Tikhonov-regularized m-mode analysis imaging. This technique constructs images of the entire sky in a single synthesis imaging step with exact treatment of widefield effects. We describe how the CLEAN algorithm can be adapted to deconvolve maps generated by m-mode analysis imaging. We demonstrate Tikhonov-regularized m-mode analysis imaging using the Owens Valley Long Wavelength Array (OVRO-LWA) by generating 8 new maps of the sky north of δ = -30° with 15 arcmin angular resolution, at frequencies evenly spaced between 36.528 MHz and 73.152 MHz, and ~800 mJy/beam thermal noise. These maps are a 10-fold improvement in angular resolution over existing full-sky maps at comparable frequencies, which have angular resolutions ≥ 2°. Each map is constructed exclusively from interferometric observations and does not represent the globally averaged sky brightness. Future improvements will incorporate total power radiometry, improved thermal noise, and improved angular resolution -- due to the planned expansion of the OVRO-LWA to 2.6 km baselines. These maps serve as a first step on the path to the use of more sophisticated foreground filters in 21-cm cosmology incorporating the measured angular and frequency structure of all foreground contaminants.

    Comment: 27 pages, 18 figures
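    The core linear-algebra step of Tikhonov-regularized m-mode imaging is a damped least-squares inversion: for each m, the measured m-modes v are related to sky coefficients a through a transfer matrix B, and the regularized estimate is a = (B^H B + εI)^{-1} B^H v. A generic sketch of that step (matrix sizes and ε here are illustrative, not the OVRO-LWA pipeline):

    ```python
    import numpy as np

    def tikhonov_solve(B, v, eps):
        """Tikhonov-regularized least squares via the normal equations:
        minimizes ||B a - v||^2 + eps ||a||^2, i.e.
        a = (B^H B + eps I)^{-1} B^H v.
        In m-mode imaging, B is the transfer matrix for one m and
        v the vector of measured m-modes."""
        BhB = B.conj().T @ B
        rhs = B.conj().T @ v
        return np.linalg.solve(BhB + eps * np.eye(BhB.shape[0]), rhs)
    ```

    The regularization parameter ε controls the trade-off between noise amplification in poorly measured sky modes and bias in well measured ones; the block-diagonal structure over m is what makes whole-sky inversion tractable.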

    The High Time Resolution Universe Survey VI: An Artificial Neural Network and Timing of 75 Pulsars

    We present 75 pulsars discovered in the mid-latitude portion of the High Time Resolution Universe survey, 54 of which have full timing solutions. All the pulsars have spin periods greater than 100 ms, and none of those with timing solutions are in binaries. Two display particularly interesting behaviour; PSR J1054-5944 is found to be an intermittent pulsar, and PSR J1809-0119 has glitched twice since its discovery. In the second half of the paper we discuss the development and application of an artificial neural network in the data-processing pipeline for the survey. We discuss the tests that were used to generate scores and find that our neural network was able to reject over 99% of the candidates produced in the data processing, and able to blindly detect 85% of pulsars. We suggest that improvements to the accuracy should be possible if further care is taken when training an artificial neural network; for example ensuring that a representative sample of the pulsar population is used during the training process, or the use of different artificial neural networks for the detection of different types of pulsars.

    Comment: 15 pages, 8 figures
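    The candidate-triage role of such a network is simple in outline: each candidate is reduced to a feature vector (the "scores" above), a small network maps it to a pulsar probability, and only candidates above a threshold are inspected by eye. A generic sketch of that forward pass follows; the architecture, feature count, and weights are illustrative stand-ins, not the survey's trained network.

    ```python
    import numpy as np

    def score_candidates(features, W1, b1, W2, b2):
        """Forward pass of a one-hidden-layer network: each row of
        `features` is one candidate's score vector; output is a
        probability-like value in (0, 1)."""
        hidden = np.tanh(features @ W1 + b1)
        return 1.0 / (1.0 + np.exp(-(hidden @ W2 + b2)))

    def triage(features, weights, threshold=0.5):
        """Return indices of candidates kept for inspection, plus all scores."""
        scores = score_candidates(features, *weights)
        return np.flatnonzero(scores > threshold), scores
    ```

    The abstract's caveat about training data applies here: the weights encode whatever pulsar population the training set contained, so unusual pulsars (very long periods, intermittency) risk falling below threshold unless represented during training.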

    Digital Signal Processing using Stream High Performance Computing: A 512-input Broadband Correlator for Radio Astronomy

    A "large-N" correlator that makes use of Field Programmable Gate Arrays and Graphics Processing Units has been deployed as the digital signal processing system for the Long Wavelength Array station at Owens Valley Radio Observatory (LWA-OV), to enable the Large Aperture Experiment to Detect the Dark Ages (LEDA). The system samples a ~100 MHz baseband and processes signals from 512 antennas (256 dual polarization) over a ~58 MHz instantaneous sub-band, achieving 16.8 Tops/s and 0.236 Tbit/s throughput in a 9 kW envelope and single rack footprint. The output data rate is 260 MB/s for 9 second time averaging of cross-power and 1 second averaging of total-power data. At deployment, the LWA-OV correlator was the largest in production in terms of N and is the third largest in terms of complex multiply accumulations, after the Very Large Array and Atacama Large Millimeter Array. The correlator's comparatively fast development time and low cost establish a practical foundation for the scalability of a modular, heterogeneous, computing architecture.

    Comment: 10 pages, 8 figures, submitted to JA
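    The FPGA/GPU split described above follows the standard FX architecture: the "F" stage channelizes each antenna's voltage stream (on the FPGAs), and the "X" stage cross-multiplies and time-averages every antenna pair (on the GPUs), with cost scaling as N². A minimal NumPy sketch of the arithmetic, assuming complex baseband samples and a plain FFT in place of the real system's polyphase filterbank:

    ```python
    import numpy as np

    def fx_correlate(voltages, nchan):
        """Minimal FX correlator.
        voltages: (nant, nsamp) complex baseband samples.
        F stage: split each stream into spectra of nchan channels.
        X stage: cross-multiply all antenna pairs and average over time.
        Returns visibilities of shape (nant, nant, nchan)."""
        nant, nsamp = voltages.shape
        nspec = nsamp // nchan
        blocks = voltages[:, :nspec * nchan].reshape(nant, nspec, nchan)
        spectra = np.fft.fft(blocks, axis=2)
        # vis[a, b, c] = <S_a(c) * conj(S_b(c))> averaged over spectra
        return np.einsum('atc,btc->abc', spectra, spectra.conj()) / nspec
    ```

    The X-stage inner product is where the quoted 16.8 Tops/s is spent: with 512 inputs there are ~131,000 antenna pairs to multiply-accumulate every spectrum, which is why the cross-multiply maps onto GPUs while the channelization stays on FPGAs close to the samplers.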